full transcript

From the Ted Talk by Chand John: What's the fastest way to alphabetize your bookshelf?

Unscramble the Blue Letters

You work at the college library. You're in the mldide of a quiet aorotnefn when suddenly a shipment of 1,280 different books arrives. The books have been dropped of in one long straight line, but they're all out of order, and the aoamtuitc sorting system is broken. To make matters worse, csealss start tomorrow, which means that first thing in the morning, students will show up in dervos looking for these books. How can you get them all srteod in time? One way would be to srtat at one end of the line with the first pair of books. If the first two books are in order, then lveae them as they are. Otherwise, swap them. Then, look at the second and third books, repeat the process, and continue until you rceah the end of the line. At some point, you'll come across the book that should be last, and keep swapping it with every subsequent book, moving it down the line until it reaches the end where it belongs. Then, start from the beginning and repeat the process to get the second to last book in its proper place, and keep going until all books are sorted. This approach is clelad Bubble Sort. It's simple but slow. You'd make 1,279 comparisons in the first round, then 1,278, and so on, addnig up to 818,560 comparisons. If each took just one second, the process would take over nine days. A second seratgty would be to start by sorting just the first two books. Then, take the third book and compare it with the book in the second spot. If it belongs before the second book, swap them, then compare it with the book in the first spot, and swap again if needed. Now you've sorted the first three books. Keep adding one book at a time to the sorted sub-line, comparing and swapping the new book with the one before it until it's correctly placed among the books sorted so far. This is called Insertion Sort. Unlike bbbule Sort, it usually doesn't require comparing every pair of books. On average, we'd expect to only need to compare each book to half of the books that came before it. In that case, the total number of comparisons would be 409,280, taking almost five days. You're still doing way too many comparisons. Here's a better idea. First, pick a random book. Call it the partition and compare it to every other book. Then, divide the line by pcanilg all the books that come before the potariitn on its left and all the ones that come after it on its right. You've just saved laods of time by not having to compare any of the books on the left to any of the ones on the right ever again. Now, looking only at the books on the left, you can again pick a random partition book and separate those books that come before it from those that come after it. You can keep creating sub-partitions like this until you have a bunch of small sub-lines, each of which you'd sort quickly using another strategy, like Insertion Sort. Each round of partitioning requires about 1,280 comparisons. If your partitions are pertty baeaclnd, dividing the books into 128 sub-lines of ten would take about seven rounds, or 8,960 seconds. Sorting these sub-lines would add about 22 seconds each. All in all, this method known as QuickSort could sort the bokos in under three and a half hrous. But there's a cctah. Your pnrttoaiis could end up lopsided, saving no time at all. Luckily, this rarely happens. That's why QuickSort is one of the most ecifinfet segttiares used by programmers today. They use it for things like sorting items in an oinlne store by price, or creating a list of all the gas stnoatis csole to a given location sorted by distance. In your case, you're done qcuik sitonrg with time to spare. Just another high-stakes day in the library.

Open Cloze

You work at the college library. You're in the ______ of a quiet _________ when suddenly a shipment of 1,280 different books arrives. The books have been dropped of in one long straight line, but they're all out of order, and the _________ sorting system is broken. To make matters worse, _______ start tomorrow, which means that first thing in the morning, students will show up in ______ looking for these books. How can you get them all ______ in time? One way would be to _____ at one end of the line with the first pair of books. If the first two books are in order, then _____ them as they are. Otherwise, swap them. Then, look at the second and third books, repeat the process, and continue until you _____ the end of the line. At some point, you'll come across the book that should be last, and keep swapping it with every subsequent book, moving it down the line until it reaches the end where it belongs. Then, start from the beginning and repeat the process to get the second to last book in its proper place, and keep going until all books are sorted. This approach is ______ Bubble Sort. It's simple but slow. You'd make 1,279 comparisons in the first round, then 1,278, and so on, ______ up to 818,560 comparisons. If each took just one second, the process would take over nine days. A second ________ would be to start by sorting just the first two books. Then, take the third book and compare it with the book in the second spot. If it belongs before the second book, swap them, then compare it with the book in the first spot, and swap again if needed. Now you've sorted the first three books. Keep adding one book at a time to the sorted sub-line, comparing and swapping the new book with the one before it until it's correctly placed among the books sorted so far. This is called Insertion Sort. Unlike ______ Sort, it usually doesn't require comparing every pair of books. On average, we'd expect to only need to compare each book to half of the books that came before it. In that case, the total number of comparisons would be 409,280, taking almost five days. You're still doing way too many comparisons. Here's a better idea. First, pick a random book. Call it the partition and compare it to every other book. Then, divide the line by _______ all the books that come before the _________ on its left and all the ones that come after it on its right. You've just saved _____ of time by not having to compare any of the books on the left to any of the ones on the right ever again. Now, looking only at the books on the left, you can again pick a random partition book and separate those books that come before it from those that come after it. You can keep creating sub-partitions like this until you have a bunch of small sub-lines, each of which you'd sort quickly using another strategy, like Insertion Sort. Each round of partitioning requires about 1,280 comparisons. If your partitions are ______ ________, dividing the books into 128 sub-lines of ten would take about seven rounds, or 8,960 seconds. Sorting these sub-lines would add about 22 seconds each. All in all, this method known as QuickSort could sort the _____ in under three and a half _____. But there's a _____. Your __________ could end up lopsided, saving no time at all. Luckily, this rarely happens. That's why QuickSort is one of the most _________ __________ used by programmers today. They use it for things like sorting items in an ______ store by price, or creating a list of all the gas ________ _____ to a given location sorted by distance. In your case, you're done _____ _______ with time to spare. Just another high-stakes day in the library.

Solution

  1. droves
  2. hours
  3. catch
  4. partition
  5. strategies
  6. classes
  7. sorted
  8. loads
  9. reach
  10. adding
  11. stations
  12. online
  13. sorting
  14. middle
  15. start
  16. balanced
  17. pretty
  18. books
  19. automatic
  20. afternoon
  21. called
  22. efficient
  23. strategy
  24. leave
  25. placing
  26. partitions
  27. quick
  28. close
  29. bubble

Original Text

You work at the college library. You're in the middle of a quiet afternoon when suddenly a shipment of 1,280 different books arrives. The books have been dropped of in one long straight line, but they're all out of order, and the automatic sorting system is broken. To make matters worse, classes start tomorrow, which means that first thing in the morning, students will show up in droves looking for these books. How can you get them all sorted in time? One way would be to start at one end of the line with the first pair of books. If the first two books are in order, then leave them as they are. Otherwise, swap them. Then, look at the second and third books, repeat the process, and continue until you reach the end of the line. At some point, you'll come across the book that should be last, and keep swapping it with every subsequent book, moving it down the line until it reaches the end where it belongs. Then, start from the beginning and repeat the process to get the second to last book in its proper place, and keep going until all books are sorted. This approach is called Bubble Sort. It's simple but slow. You'd make 1,279 comparisons in the first round, then 1,278, and so on, adding up to 818,560 comparisons. If each took just one second, the process would take over nine days. A second strategy would be to start by sorting just the first two books. Then, take the third book and compare it with the book in the second spot. If it belongs before the second book, swap them, then compare it with the book in the first spot, and swap again if needed. Now you've sorted the first three books. Keep adding one book at a time to the sorted sub-line, comparing and swapping the new book with the one before it until it's correctly placed among the books sorted so far. This is called Insertion Sort. Unlike Bubble Sort, it usually doesn't require comparing every pair of books. On average, we'd expect to only need to compare each book to half of the books that came before it. In that case, the total number of comparisons would be 409,280, taking almost five days. You're still doing way too many comparisons. Here's a better idea. First, pick a random book. Call it the partition and compare it to every other book. Then, divide the line by placing all the books that come before the partition on its left and all the ones that come after it on its right. You've just saved loads of time by not having to compare any of the books on the left to any of the ones on the right ever again. Now, looking only at the books on the left, you can again pick a random partition book and separate those books that come before it from those that come after it. You can keep creating sub-partitions like this until you have a bunch of small sub-lines, each of which you'd sort quickly using another strategy, like Insertion Sort. Each round of partitioning requires about 1,280 comparisons. If your partitions are pretty balanced, dividing the books into 128 sub-lines of ten would take about seven rounds, or 8,960 seconds. Sorting these sub-lines would add about 22 seconds each. All in all, this method known as QuickSort could sort the books in under three and a half hours. But there's a catch. Your partitions could end up lopsided, saving no time at all. Luckily, this rarely happens. That's why QuickSort is one of the most efficient strategies used by programmers today. They use it for things like sorting items in an online store by price, or creating a list of all the gas stations close to a given location sorted by distance. In your case, you're done quick sorting with time to spare. Just another high-stakes day in the library.

Frequently Occurring Word Combinations

ngrams of length 2

collocation frequency
insertion sort 2

Important Words

  1. add
  2. adding
  3. afternoon
  4. approach
  5. arrives
  6. automatic
  7. average
  8. balanced
  9. beginning
  10. belongs
  11. book
  12. books
  13. broken
  14. bubble
  15. bunch
  16. call
  17. called
  18. case
  19. catch
  20. classes
  21. close
  22. college
  23. compare
  24. comparing
  25. comparisons
  26. continue
  27. correctly
  28. creating
  29. day
  30. days
  31. distance
  32. divide
  33. dividing
  34. dropped
  35. droves
  36. efficient
  37. expect
  38. gas
  39. hours
  40. idea
  41. insertion
  42. items
  43. leave
  44. left
  45. library
  46. line
  47. list
  48. loads
  49. location
  50. long
  51. lopsided
  52. luckily
  53. matters
  54. means
  55. method
  56. middle
  57. morning
  58. moving
  59. needed
  60. number
  61. online
  62. order
  63. pair
  64. partition
  65. partitioning
  66. partitions
  67. pick
  68. place
  69. placing
  70. point
  71. pretty
  72. price
  73. process
  74. programmers
  75. proper
  76. quick
  77. quickly
  78. quicksort
  79. quiet
  80. random
  81. rarely
  82. reach
  83. reaches
  84. repeat
  85. require
  86. requires
  87. rounds
  88. saved
  89. saving
  90. seconds
  91. separate
  92. shipment
  93. show
  94. simple
  95. slow
  96. small
  97. sort
  98. sorted
  99. sorting
  100. spare
  101. spot
  102. start
  103. stations
  104. store
  105. straight
  106. strategies
  107. strategy
  108. students
  109. subsequent
  110. suddenly
  111. swap
  112. swapping
  113. system
  114. ten
  115. time
  116. today
  117. tomorrow
  118. total
  119. work
  120. worse